本文研究了基于K最近邻居算法的专家系统的效力,用于激光散斑图像采样应用于早期检测糖尿病。利用人工智能引导激光散斑成像技术的最新发展,可以优化与合适的AI技术相关联的激光参数,例如波长,能级和图像纹理测量,以有效地与皮肤组织的亚细胞特性相互作用检测糖尿病的早期迹象。由于其优化的激光物理学和AI技术的优化组合,新方法可能比典型的皮肤葡萄糖水平观察更有效,并且另外,它允许非专家个人进行更频繁的皮肤组织测试以进行早期检测糖尿病。
translated by 谷歌翻译
We propose a novel task, G4C (Goal-driven Guidance Generation in Grounded Communication), for studying goal-driven and grounded natural language interactions. Specifically, we choose Dungeons and Dragons (D&D) -- a role-playing game consisting of multiple player characters and a Dungeon Master (DM) who collaborate to achieve a set of goals that are beneficial to the players -- as a testbed for this task. Here, each of the player characters is a student, with their own personas and abilities, and the DM is the teacher, an arbitrator of the rules of the world and responsible for assisting and guiding the students towards a global goal. We propose a theory-of-mind-inspired methodology for training such a DM with reinforcement learning (RL), where a DM: (1) learns to predict how the players will react to its utterances using a dataset of D&D dialogue transcripts; and (2) uses this prediction as a reward function providing feedback on how effective these utterances are at guiding the players towards a goal. Human and automated evaluations show that a DM trained with RL to generate guidance by incorporating a theory-of-mind of the players significantly improves the players' ability to achieve goals grounded in their shared world.
translated by 谷歌翻译
Neural Networks (GNNs) have revolutionized the molecular discovery to understand patterns and identify unknown features that can aid in predicting biophysical properties and protein-ligand interactions. However, current models typically rely on 2-dimensional molecular representations as input, and while utilization of 2\3- dimensional structural data has gained deserved traction in recent years as many of these models are still limited to static graph representations. We propose a novel approach based on the transformer model utilizing GNNs for characterizing dynamic features of protein-ligand interactions. Our message passing transformer pre-trains on a set of molecular dynamic data based off of physics-based simulations to learn coordinate construction and make binding probability and affinity predictions as a downstream task. Through extensive testing we compare our results with the existing models, our MDA-PLI model was able to outperform the molecular interaction prediction models with an RMSE of 1.2958. The geometric encodings enabled by our transformer architecture and the addition of time series data add a new dimensionality to this form of research.
translated by 谷歌翻译
We present NusaCrowd, a collaborative initiative to collect and unite existing resources for Indonesian languages, including opening access to previously non-public resources. Through this initiative, we have has brought together 137 datasets and 117 standardized data loaders. The quality of the datasets has been assessed manually and automatically, and their effectiveness has been demonstrated in multiple experiments. NusaCrowd's data collection enables the creation of the first zero-shot benchmarks for natural language understanding and generation in Indonesian and its local languages. Furthermore, NusaCrowd brings the creation of the first multilingual automatic speech recognition benchmark in Indonesian and its local languages. Our work is intended to help advance natural language processing research in under-represented languages.
translated by 谷歌翻译
User and product information associated with a review is useful for sentiment polarity prediction. Typical approaches incorporating such information focus on modeling users and products as implicitly learned representation vectors. Most do not exploit the potential of historical reviews, or those that currently do require unnecessary modifications to model architecture or do not make full use of user/product associations. The contribution of this work is twofold: i) a method to explicitly employ historical reviews belonging to the same user/product to initialize representations, and ii) efficient incorporation of textual associations between users and products via a user-product cross-context module. Experiments on IMDb, Yelp-2013 and Yelp-2014 benchmarks show that our approach substantially outperforms previous state-of-the-art. Since we employ BERT-base as the encoder, we additionally provide experiments in which our approach performs well with Span-BERT and Longformer. Furthermore, experiments where the reviews of each user/product in the training data are downsampled demonstrate the effectiveness of our approach under a low-resource setting.
translated by 谷歌翻译
Adversarial attacks hamper the decision-making ability of neural networks by perturbing the input signal. The addition of calculated small distortion to images, for instance, can deceive a well-trained image classification network. In this work, we propose a novel attack technique called Sparse Adversarial and Interpretable Attack Framework (SAIF). Specifically, we design imperceptible attacks that contain low-magnitude perturbations at a small number of pixels and leverage these sparse attacks to reveal the vulnerability of classifiers. We use the Frank-Wolfe (conditional gradient) algorithm to simultaneously optimize the attack perturbations for bounded magnitude and sparsity with $O(1/\sqrt{T})$ convergence. Empirical results show that SAIF computes highly imperceptible and interpretable adversarial examples, and outperforms state-of-the-art sparse attack methods on the ImageNet dataset.
translated by 谷歌翻译
Quantifying motion in 3D is important for studying the behavior of humans and other animals, but manual pose annotations are expensive and time-consuming to obtain. Self-supervised keypoint discovery is a promising strategy for estimating 3D poses without annotations. However, current keypoint discovery approaches commonly process single 2D views and do not operate in the 3D space. We propose a new method to perform self-supervised keypoint discovery in 3D from multi-view videos of behaving agents, without any keypoint or bounding box supervision in 2D or 3D. Our method uses an encoder-decoder architecture with a 3D volumetric heatmap, trained to reconstruct spatiotemporal differences across multiple views, in addition to joint length constraints on a learned 3D skeleton of the subject. In this way, we discover keypoints without requiring manual supervision in videos of humans and rats, demonstrating the potential of 3D keypoint discovery for studying behavior.
translated by 谷歌翻译
Pragmatics is an essential part of communication, but it remains unclear what mechanisms underlie human pragmatic communication and whether NLP systems capture pragmatic language understanding. To investigate both these questions, we perform a fine-grained comparison of language models and humans on seven pragmatic phenomena, using zero-shot prompting on an expert-curated set of English materials. We ask whether models (1) select pragmatic interpretations of speaker utterances, (2) make similar error patterns as humans, and (3) use similar linguistic cues as humans to solve the tasks. We find that the largest models achieve high accuracy and match human error patterns: within incorrect responses, models favor the literal interpretation of an utterance over heuristic-based distractors. We also find evidence that models and humans are sensitive to similar linguistic cues. Our results suggest that even paradigmatic pragmatic phenomena may be solved without explicit representations of other agents' mental states, and that artificial models can be used to gain mechanistic insights into human pragmatic processing.
translated by 谷歌翻译
In the cybersecurity setting, defenders are often at the mercy of their detection technologies and subject to the information and experiences that individual analysts have. In order to give defenders an advantage, it is important to understand an attacker's motivation and their likely next best action. As a first step in modeling this behavior, we introduce a security game framework that simulates interplay between attackers and defenders in a noisy environment, focusing on the factors that drive decision making for attackers and defenders in the variants of the game with full knowledge and observability, knowledge of the parameters but no observability of the state (``partial knowledge''), and zero knowledge or observability (``zero knowledge''). We demonstrate the importance of making the right assumptions about attackers, given significant differences in outcomes. Furthermore, there is a measurable trade-off between false-positives and true-positives in terms of attacker outcomes, suggesting that a more false-positive prone environment may be acceptable under conditions where true-positives are also higher.
translated by 谷歌翻译
Dry Eye Disease (DED) is one of the most common ocular diseases: over five percent of US adults suffer from DED. Tear film instability is a known factor for DED, and is thought to be regulated in large part by the thin lipid layer that covers and stabilizes the tear film. In order to aid eye related disease diagnosis, this work proposes a novel paradigm in using computer vision techniques to numerically analyze the tear film lipid layer (TFLL) spread. Eleven videos of the tear film lipid layer spread are collected with a micro-interferometer and a subset are annotated. A tracking algorithm relying on various pillar computer vision techniques is developed. Our method can be found at https://easytear-dev.github.io/.
translated by 谷歌翻译